Introduction

library(plyr)
-------------------------------------------------------------------------------------------------------------------------
You have loaded plyr after dplyr - this is likely to cause problems.
If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
library(plyr); library(dplyr)
-------------------------------------------------------------------------------------------------------------------------

Attaching package: ‘plyr’

The following objects are masked from ‘package:plotly’:

    arrange, mutate, rename, summarise

The following objects are masked from ‘package:dplyr’:

    arrange, count, desc, failwith, id, mutate, rename, summarise, summarize

The following object is masked from ‘package:purrr’:

    compact

Analysis

Read in the data

Read in data from files

How many breweries are there in each state?
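
One way to answer this can be sketched in base R, assuming the breweries data frame has one row per brewery and a `State` column. The `breweries_demo` rows below are invented for illustration and are not taken from the data.

```r
# Hypothetical toy data standing in for the real breweries file
breweries_demo <- data.frame(
  Brew_ID = 1:6,
  Name    = c("A", "B", "C", "D", "E", "F"),
  State   = c("CO", "CO", "TX", "CA", "CA", "CA"),
  stringsAsFactors = FALSE
)

# Count rows (breweries) per state
state_counts <- as.data.frame(table(State = breweries_demo$State),
                              responseName = "n",
                              stringsAsFactors = FALSE)

# Sort so the states with the most breweries come first
state_counts <- state_counts[order(-state_counts$n), ]
print(state_counts)
```

Base R is used here so the sketch does not depend on which of plyr or dplyr currently owns `count()`.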

Heat Map of Breweries per State

Cleaning the data

Preparing/transforming the data into a usable form for analysis, visualization, etc.

Merging Dataframes

Merge beer data with the breweries data. Print the first six observations and the last six observations to check the merged file.


beer[order(beer$Brewery_id), ] # sort the data to determine column for merge

# merge on Brewery ID
breweries_named <- rename(breweries, c("Brew_ID"="Brewery_id"))

brewing_beer <- merge(breweries_named, beer, by = "Brewery_id", all = TRUE) # outer join

brewed_beer <- rename(brewing_beer, c("Name.x"="Brewery", "Name.y"="Beer")) # rename breweries and beer

head(brewed_beer, 6) # show the first 6 rows of data
tail(brewed_beer, 6) # show the last 6 rows of data

Missing Data

Missing data are in columns ABV (62) and IBU (1005) only. Two cleaning options are considered:

1. Keep complete records only.
2. Replace each NA with the average of the remaining values in the column.

colSums(is.na(averaged_beer))
Brewery_id    Brewery       City      State       Beer    Beer_ID        ABV        IBU      Style     Ounces 
         0          0          0          0          0          0          0          0          0          0 
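
The two cleaning options can be sketched as follows. This is a minimal illustration on invented values: `beer_demo` and the helper `impute_mean()` are hypothetical names, not part of the original analysis.

```r
# Hypothetical toy data; the real columns are brewed_beer$ABV and brewed_beer$IBU
beer_demo <- data.frame(
  ABV = c(0.05, NA, 0.07, 0.06),
  IBU = c(20, 40, NA, NA)
)

# Option 1: keep only rows with no missing values
complete_demo <- beer_demo[complete.cases(beer_demo), ]

# Option 2: replace only the NA entries with the column mean,
# leaving the observed values untouched
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}
averaged_demo <- beer_demo
averaged_demo$ABV <- impute_mean(averaged_demo$ABV)
averaged_demo$IBU <- impute_mean(averaged_demo$IBU)

colSums(is.na(averaged_demo)) # both columns should now report zero NA
```

Indexing on `is.na(x)` matters: assigning `mean(x, na.rm = TRUE)` to the whole column would overwrite every observation with a single value, which also yields zero NAs but destroys the data.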

Median Alcohol Content

Median of Alcohol by Volume and Bitterness by State
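
A per-state median table like the `median_df` used in the plots could be built along these lines. This is a sketch on invented rows; `beer_demo` and `median_demo` are hypothetical names, and it assumes records with missing ABV or IBU have already been dropped.

```r
# Hypothetical toy data standing in for the complete-case beer data
beer_demo <- data.frame(
  State = c("CO", "CO", "CO", "TX", "TX"),
  ABV   = c(0.05, 0.06, 0.09, 0.04, 0.08),
  IBU   = c(20, 35, 70, 15, 45)
)

# Median ABV and IBU within each state; the formula interface keeps
# the grouping column named "State", so no rename is needed afterwards
median_demo <- aggregate(cbind(ABV, IBU) ~ State, data = beer_demo, FUN = median)
print(median_demo)
```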

ABV_bar <- ggplot(data = median_df, aes(x = State, y = ABV, fill = State)) +
  geom_bar(stat = "identity", width = 0.75) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle("Median ABV by State")
ggplotly(ABV_bar)

IBU_bar <- ggplot(data = median_df, aes(x = State, y = IBU, fill = State)) +
  geom_bar(stat = "identity", width = 0.75) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle("Median IBU by State")
ggplotly(IBU_bar)

Max ABV and IBU

Which state has the beer with the highest alcohol content (ABV)? Which state has the most bitter (IBU) beer?
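
A minimal sketch of the lookup, assuming one row per beer with `State`, `ABV`, and `IBU` columns. The state codes and values below are placeholders, not results from the data.

```r
# Hypothetical toy data; "AA", "BB", "CC" are placeholder state codes
beer_demo <- data.frame(
  State = c("AA", "BB", "CC", "AA"),
  ABV   = c(0.05, 0.09, 0.07, 0.06),
  IBU   = c(30, 45, 90, 20)
)

# which.max() returns the index of the first maximum and skips NA values
strongest <- beer_demo[which.max(beer_demo$ABV), c("State", "ABV")]
bitterest <- beer_demo[which.max(beer_demo$IBU), c("State", "IBU")]

strongest
bitterest
```

If several beers tie for the maximum, `which.max()` reports only the first, so filtering with `ABV == max(ABV, na.rm = TRUE)` may be preferable when ties matter.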

Summary Statistics

Report the summary statistics and distribution of the ABV variable.
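
The kind of output this section calls for can be sketched with base R. The vector `abv_demo` is invented for illustration, and the choice of roughly five bins is an arbitrary assumption.

```r
# Hypothetical ABV values, including one missing entry
abv_demo <- c(0.045, 0.050, 0.052, 0.060, 0.065, 0.070, 0.099, NA)

# Minimum, quartiles, mean, maximum, and a count of NA values
abv_summary <- summary(abv_demo)
abv_sd <- sd(abv_demo, na.rm = TRUE)

# Bin counts without drawing; drop plot = FALSE to draw the histogram
abv_hist <- hist(abv_demo, breaks = 5, plot = FALSE)

abv_summary
abv_hist$counts
```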

Relationship between Bitterness and Alcohol Content

Is there an apparent relationship between the bitterness of the beer and its alcoholic content? Draw a scatter plot. Make your best judgment of a relationship and EXPLAIN your answer.
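
One way to back a visual judgment with numbers is a correlation test plus a simple linear fit. The sketch below uses invented values in `beer_demo`. It illustrates the approach only and says nothing about the real data.

```r
# Hypothetical toy data with ABV and IBU for seven beers
beer_demo <- data.frame(
  ABV = c(0.040, 0.050, 0.055, 0.060, 0.070, 0.080, 0.090),
  IBU = c(12, 20, 30, 35, 60, 70, 95)
)

# Pearson correlation with a test of the null hypothesis of zero correlation
abv_ibu_test <- cor.test(beer_demo$ABV, beer_demo$IBU)

# Simple linear fit: expected change in IBU per unit change in ABV
abv_ibu_fit <- lm(IBU ~ ABV, data = beer_demo)

abv_ibu_test$estimate # correlation coefficient
coef(abv_ibu_fit)     # intercept and slope

# In an interactive session, the scatter plot with the fitted line:
# plot(IBU ~ ABV, data = beer_demo); abline(abv_ibu_fit)
```

A correlation describes association only. It does not show that higher alcohol content causes higher bitterness or the reverse.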

IPAs vs. Ales

Budweiser would also like to investigate the difference with respect to IBU and ABV between IPAs (India Pale Ales) and other types of Ale (any beer with “Ale” in its name other than IPA). You decide to use KNN classification to investigate this relationship. Provide statistical evidence one way or the other. You can of course assume your audience is comfortable with percentages … KNN is very easy to understand conceptually.

In addition, while you have decided to use KNN to investigate this relationship (KNN is required) you may also feel free to supplement your response to this question with any other methods or techniques you have learned. Creativity and alternative solutions are always encouraged.
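
A minimal KNN sketch, assuming the `class` package (a recommended package shipped with standard R installations) and two classes labelled "IPA" and "Other Ale". The data are simulated, and `k = 5` and the 70/30 split are arbitrary choices for illustration.

```r
library(class) # provides knn()

set.seed(1) # reproducible toy data and split

# Hypothetical toy data: two simulated groups, not the real beers
n <- 60
ales_demo <- data.frame(
  ABV   = c(rnorm(n, 0.070, 0.008), rnorm(n, 0.055, 0.008)),
  IBU   = c(rnorm(n, 70, 12),       rnorm(n, 35, 12)),
  Class = factor(rep(c("IPA", "Other Ale"), each = n))
)

# Standardize predictors so ABV and IBU contribute on the same scale.
# Simplification: scaling uses all rows, including the held-out ones.
features <- scale(ales_demo[, c("ABV", "IBU")])

# 70/30 train/test split
train_idx <- sample(nrow(ales_demo), size = round(0.7 * nrow(ales_demo)))

pred <- knn(train = features[train_idx, ],
            test  = features[-train_idx, ],
            cl    = ales_demo$Class[train_idx],
            k     = 5)

# Confusion matrix and overall accuracy on the held-out rows
conf_mat <- table(Predicted = pred, Actual = ales_demo$Class[-train_idx])
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

conf_mat
accuracy
```

Accuracy is a percentage the audience can read directly. Repeating the split over several seeds, or over a range of `k`, would show how stable that percentage is.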

Additional inferences

Knock their socks off! Find one other useful inference from the data that you feel Budweiser may be able to find value in. You must convince them why it is important and back up your conviction with appropriate statistical evidence.
